Reviews for Paper 8: Can We Teach How to Explore Substrates and Systems?

Reviews and Comments

Review 1
PC member	Clayton Lewis
Overall evaluation	This is a well-written paper on an interesting and important topic. How can students learn about a substrate-like system that remains open to modification, by them and others? The authors describe difficulties that arise when students try to do this, and a variety of approaches to making the task easier. The authors propose their essay as a basis for conversation, and I think they have done a good job. Their discussion is grounded in real learner experience, and teacher experience, with a real system, Squeak/Smalltalk. I'd like to respond to their invitation with some questions and thoughts inspired by their paper.
What special challenges are created by the malleability of substrates? One might be that the substrate as experienced by different users is different, and so has to be understood differently, once users have exploited their ability to modify the system. But the riposte might be that in conventional software engineering, too, different people have to learn different software, so the challenge of mastering new programs isn't really novel. A response to this could be that in conventional SWE one is conscious of creating a new tool, and hence would be (or should be) aware of the need to document it. A user of a substrate might not have that same awareness. Could facilities be added to the substrate machinery that would encourage documentation to grow organically as the substrate is developed and elaborated? To what extent might this be automated, so that documentation would be created without the author needing to attend to the matter?
To what extent does Squeak/Smalltalk succeed in making the implementation of software legible, that is, understandable by reading? Could this be improved? A related question is, to what extent is the runtime behavior of programs understandable at a high level, rather than in terms of a lower-level machine model? Can debugging be done using concepts that are available when writing code, or does one need to think about matters differently? Could this be improved?
How does the need to explore a system arise? Are the challenges for exploration different for someone seeking to add functionality by building on a substrate than for someone seeking just to use a software tool someone else has built? Can substrate design narrow this gap, if there is one?
The paper mentions learner use of language models as an avenue, while noting that LLM support for little-used languages could be sparse. How much do current models actually know about Squeak/Smalltalk? If a learner uploads code they want to work on, (a) how easy is this to do, given how programs are represented in Squeak/Smalltalk, and (b) how good is the support the LLM can provide if this is done? Do developers of other substrates have a sense of these matters for their systems? Are some easier for LLMs to support than others, and if so, why?
The paper brings out that Squeak/Smalltalk users have a number of search tools available, without it always being clear which to use when, and for what. The programming walkthrough could be considered as an approach to understanding the issues there (and potentially other aspects of substrate design) [Bell, B., Citrin, W., Lewis, C., Rieman, J., Weaver, R., Wilde, N., & Zorn, B. (1994). Using the programming walkthrough to aid in programming language design. Software: Practice and Experience, 24(1), 1-25]. In the programming walkthrough an analyst describes the work trace of a hypothetical person working with a system, while attaching to each choice point a workable rationale, the "guiding knowledge" that the user has to bring to their work. Ideally, one wants to minimize the guiding knowledge that is required, and a multiplicity of tools that serve related purposes can add to the requirements. Also, thinking through when one would use Tool A rather than Tool B can make it clearer what the documentation needs to convey. (As originally conceived the programming walkthrough was aimed at "bare" languages, written with minimal cues from an IDE, but an extension to take into account cues, as in the related cognitive walkthrough technique, is natural.)
A broader view of some of the same ideas would ask, is there a core of knowledge of Squeak/Smalltalk that a teacher can have, that allows them to support learners? Or does this knowledge need to be elaborated indefinitely as new structures are erected on the substrate?

Review 2
PC member	Joel Jakubovic
Overall evaluation	This paper is about the authors' experience teaching Squeak/Smalltalk to their undergraduate students, focusing on issues of explorability, discoverability, and recoverability from errors. It is a welcome contribution to the workshop! However, I would have appreciated a round of proofreading, as some of the spelling and grammar mistakes made it hard to read in parts. I don't feel particularly qualified to comment on the bulk of the material regarding education; this is a specific skillset and I don't consider myself a teacher. However, I do at least have some experience with Squeak, and I learned it recently enough to be able to use my own experience to evaluate the content of the paper.
The most important problem raised is that of "unknown unknowns", as distinct from known unknowns, when you're a novice. If you have an entire afternoon to spare (or several), you could perhaps go on a random exploratory walk through all of the code of such an open system and discover what it is capable of. Technically, it's all there, as the paper points out. But under realistic circumstances, a novice has to be made explicitly aware of useful features that they aren't expecting from their prior experience. Excellent point: how do you know how to ask the right question? I don't know the answer. Indeed, my own self-directed exploration of Squeak was stuck at "where do I even start" until I visited HPI and watched the "people who know what they're doing" demonstrate some key interactions to me. To be fair, for some reason I wasn't aware of the Squeak By Example tutorials - I've since found them to be an excellent introduction and one day, if I have time, I'd like to make a video version (which would happen to better fit my learner profile than written material).
Sometimes, habits picked up from prior systems need to be actively suppressed and re-directed. For example, it was amusing to read how some students see an error message with a debugger just a button click away... and then close it without reading anything and do trial-and-error instead. I'm sure this is a reasonable reflex if your experience with error boxes is "these are always useless; I need to make it go away as fast as possible", and your experience with debugging is with very impoverished or complicated infrastructure which isn't worth getting roped into. But speaking for myself, once you've experienced live debugging and fixing in the Squeak debugger, you never want to go back. Equally amusing was the mention of eschewing recovery mechanisms in favor of downloading a fresh image. Again, this is understandable in a world where "turn it off and on again" or, God forbid, "reinstall Windows" might be a learned reflex. Sheepishly, I must admit that I myself am still not confident about Squeak's error recovery mechanisms, so I too am afraid to break the rendering loop etc. This is no doubt due to my only occasional use of Squeak; with practice, I'm sure that at some point I would be forced by circumstances to deliberately learn this feature.
"Because of inertia, they might not regularly use tools they have only seen once" - indeed. The broader point is that fear of irreversible actions discourages trying things, so the more that "undo"-like operations can be not only supported but actively advertised to the user, the better. As humans, we're hardwired to learn by watching competent others and copying them; best to exploit this to the extent possible. If I see someone solving a similar problem to mine 10x faster because they're using shortcuts or tools or whatever - I will feel impressed and inspired to use them myself. In contrast, reading an abstract description in a manual - or even a concrete example in a tutorial book - doesn't feel like watching a competent other, so I find it less psychologically compelling.
Now, time for my main criticism: the paper makes some gestures at relating to the concept of "substrates", but I find them unsatisfying. In the Motivation it says that Squeak is a system with many characteristics of a substrate; in S2.2 it says that there are systems with substrate characteristics yet which differ in other aspects. In S5 we read about insights that could help shape "substrates as well as systems in general". This is all rather cryptic! It suggests that the authors have their own idea about what distinguishes a "substrate" from a "system", but are being coy about it! Please, tell us what it is :) I think this is one of the most important questions for the workshop; my own submission and another of my review assignments already made me ask it: what, if anything, is the difference between a "substrate" and a "programming system"? I am convinced there should be some difference, otherwise we've wasted a very valuable word.
I think part of the answer, which I invite the authors to consider, is prompted by three innocuous-looking words in S2.2: the claim that Squeak the system is as "open as possible". Now, don't get me wrong; I adore Squeak, and I agree that it is extremely open. But it is only "as open as possible" from within its own running VM interface, and this is a definite limitation. For example, with other languages/systems, I can look up source code and docs online; correct me if I'm wrong, but in Squeak, a great deal of useful information (code, comments, docs) is sealed inside the non-Google-indexable image format. I was shocked at the extent to which the wealth of useful documentation (e.g. class comments) inside Squeak is simply invisible to the outside world - no wonder I had such difficulty orienting myself before I actually used the system. I don't intend this as a damning criticism, just a key difference to notice: I can at least browse and "explore" the named classes and functions in a Python library by finding some Web-hosted documentation. (There is a potential reference to this in S2.2: Squeak is "on the spectrum of low-resource languages; there is less public code, tutorials, and similar available than for other systems". I think this is mainly focusing on low "hegemony", "funding" etc, but the lack of "public code" in the sense I have discussed is an important sign that it could be more open.)
The point is that even if we disregard authoring, and stick to a read-only view of objects, classes, and methods in the system, Squeak is open only within the confines of one "viewer" (the VM). And I do think it could be made more open than this; there's some low-hanging fruit to pick. See my dormant project of crafting a declarative Kaitai Struct description of the binary image format, and another for Self images. For the latter case, in the Kaitai online IDE, I can upload a decompressed Self image and do a fair bit of exploring the contents of the image without requiring a full (and running) VM. Because the parsed structure looks like an expandable directory tree, I have further plans to expose it via FUSE as a filesystem interface. Then, the Squeak object universe would be explorable to some extent with a variety of tools other than the VM. I know this is only a beginning; not all object graph manipulations are safe or meaningful from the perspective of the higher-level Squeak system. How much of the VM do we want to risk duplicating in the FUSE server?
Still, this is my best stab at the difference between a system and a substrate: a substrate is more like a file format with a variety of editors, while a system is one big package deal. On this definition, Squeak could become a substrate if its image contents were readable and writable by more than just a VM. I look forward to exploring and testing this definition in the workshop! And of course, if the authors do have their own inkling about what the difference should be, may they shout it loud and clear! :) (I reiterate from another review that it's OK for something not to be a substrate; I think the distinction would be boring and useless if "substrate" = "system I like")

Review 4
PC member	Luke Church
Overall evaluation	A lovely piece of work that talks about the process of discovery and learning of a substrate - and especially in the context of one that is self-supporting in the Squeak/Smalltalk tradition. There are a number of nice observations made about how this self-sustaining nature affects the dynamics of learning the substrate, including that it creates the potential for a closed-cycle learning experience that is significantly easier than if the learning has to be intermediated by an educator. It's an interesting point. The authors outline a number of points of practical experience from their educational practice with these tools. The thing I personally find most intriguing in the work is the relationship between the uniformity and texture of the Squeak/Smalltalk substrate and how that interacts with the learning process. I see two tensions that I wonder if the authors could speak to in a presentation:
1 - One might imagine that a uniform material is easier to learn because everything is the same; however, I've sometimes found such materials harder to teach, because you don't know where to start, and are in the presence of a large and powerful thing all the time. Whereas a non-uniform substrate has random eccentricities to it - "oh you can't do that because that's part of the system" - which are frustrating, but give the space some structure that helps in getting going. It sounds a bit like the example you described of closing the debugger wrt the Matrix class might be an example in this direction, or the balancing act in the description of 3.1. I think there might be a fundamental design tradeoff to be explored somewhere in this direction that would be interesting.
2 - How do you teach a sense of what is possible? This applies to non-self-sustaining systems too, but when you can modify the tools themselves there is virtually unlimited potential, and one that is different from most people's life experience of computing. Usually, again, the bounds on the system form an impression of what might be achieved; without that, getting going seems like a very interesting challenge!
Looking forward to hearing more of the authors' experiences, and thoughts as to how they might apply to the teaching of other substrates.